Task Oriented In-Domain Data Augmentation

Liang, Xiao, Hu, Xinyu, Zuo, Simiao, Gong, Yeyun, Lou, Qiang, Liu, Yi, Huang, Shao-Lun, Jiao, Jian

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown superior performance in various applications and fields. To achieve better performance on specialized domains such as law and advertisement, LLMs are often continually pre-trained on in-domain data. However, existing approaches suffer from two major issues. First, in-domain data are scarce compared to general domain-agnostic data. Second, data used for continual pre-training are not task-aware, so they may not be helpful to downstream applications. We propose TRAIT, a task-oriented in-domain data augmentation framework. Our framework is divided into two parts: in-domain data selection and task-oriented synthetic passage generation. The data selection strategy identifies and selects a large amount of in-domain data from general corpora, and thus significantly enriches domain knowledge in the continual pre-training data. The synthetic passages contain guidance on how to use domain knowledge to answer questions about downstream tasks. We adapt LLMs to two domains: advertisement and math. On average, TRAIT improves LLM performance by 8% in the advertisement domain and 7.5% in the math domain.

Large language models (LLMs) have achieved significant performance improvements in various applications such as language modeling (Brown et al., 2020; Touvron et al., 2023; Chowdhery et al., 2023) and visual understanding (Radford et al., 2021). They have also shown superior performance in fields such as finance (Xie et al., 2023b), e-commerce (Ma et al., 2023) and healthcare (Bakhshandeh, 2023). However, the models are usually trained on a large amount of general domain-agnostic data, such as web corpora. Because of the lack of domain-specific training, LLMs suffer from subpar performance when directly applied to certain domains such as advertisement. To adapt LLMs to a specific domain, continual pre-training methods (Gururangan et al., 2020) are commonly applied. In particular, the LLM is continually pre-trained on in-domain corpora, so that it can acquire domain knowledge and better adapt to downstream tasks.


Intelligent Software Tooling for Improving Software Development

Cooper, Nathan

arXiv.org Artificial Intelligence

Software has eaten the world: many of the necessities and quality-of-life services people use require software. Therefore, tools that improve the software development experience, such as tools for generating code and test cases, detecting bugs, and answering questions, can have a significant impact on the world. The success of Deep Learning (DL) over the past decade has brought huge advances in automation across many domains, including software development processes. One of the main reasons behind this success is the availability of large datasets to train on, such as open-source code available through GitHub or image datasets of mobile Graphical User Interfaces (GUIs) such as RICO [112] and ReDRAW [267]. Therefore, the central research question my dissertation explores is: In what ways can the software development process be improved through leveraging DL techniques on the vast amounts of unstructured software engineering artifacts? We coin the term Intelligent Software Tools for the approaches that leverage DL to automate or augment various software development tasks.


Opportunities and Risks of LLMs for Scalable Deliberation with Polis

Small, Christopher T., Vendrov, Ivan, Durmus, Esin, Homaei, Hadjar, Barry, Elizabeth, Cornebise, Julien, Suzman, Ted, Ganguli, Deep, Megill, Colin

arXiv.org Artificial Intelligence

Polis is a platform that leverages machine intelligence to scale up deliberative processes. In this paper, we explore the opportunities and risks associated with applying Large Language Models (LLMs) to the challenges of facilitating, moderating and summarizing the results of Polis engagements. In particular, we demonstrate with pilot experiments using Anthropic's Claude that LLMs can indeed augment human intelligence to help run Polis conversations more efficiently. We find that summarization capabilities enable categorically new methods with immense promise to empower the public in collective meaning-making exercises. Notably, LLM context limitations have a significant impact on the insight and quality of these results. However, these opportunities come with risks. We discuss some of these risks, as well as principles and techniques for characterizing and mitigating them, and the implications for other deliberative or political systems that may employ LLMs. Finally, we conclude with several open future research directions for augmenting tools like Polis with LLMs.


Semantically Enhanced Dynamic Bayesian Network for Detecting Sepsis Mortality Risk in ICU Patients with Infection

Wang, Tony, Velez, Tom, Apostolova, Emilia, Tschampel, Tim, Ngo, Thuy L., Hardison, Joy

arXiv.org Machine Learning

Although timely sepsis diagnosis and prompt interventions in Intensive Care Unit (ICU) patients are associated with reduced mortality, early clinical recognition is frequently impeded by nonspecific signs of infection and failure to detect signs of sepsis-induced organ dysfunction in a constellation of dynamically changing physiological data. The goal of this work is to identify patients at risk of life-threatening sepsis using a data-centered and machine learning-driven approach. We derive a mortality risk predictive dynamic Bayesian network (DBN) guided by a customized sepsis knowledgebase and compare the predictive accuracy of the derived DBN with the Sepsis-related Organ Failure Assessment (SOFA) score, the Quick SOFA (qSOFA) score, the Simplified Acute Physiological Score (SAPS-II) and the Modified Early Warning Score (MEWS) tools. A customized sepsis ontology was used to derive the DBN node structure and semantically characterize temporal features derived from both structured physiological data and unstructured clinical notes. We assessed the performance of the DBN predictive model in predicting mortality risk and compared it to other models using Receiver Operating Characteristic (ROC) curves, area under the ROC curve (AUROC), calibration curves, and risk distributions. The derived dataset consists of 24,506 ICU stays from 19,623 patients with evidence of suspected infection, with 2,829 patients deceased at discharge. The DBN AUROC was found to be 0.91, which outperformed the SOFA (0.843), qSOFA (0.66), MEWS (0.729), and SAPS-II (0.766) scoring tools. Continuous Net Reclassification Index and Integrated Discrimination Improvement analysis supported the superiority of the DBN with respect to SOFA, qSOFA, MEWS, and SAPS-II. Compared with conventional rule-based risk scoring tools, the sepsis knowledgebase-driven DBN algorithm offers improved performance for predicting mortality of infected patients in intensive care units.
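
For readers unfamiliar with the AUROC figures quoted above, the following is a minimal sketch of how the metric can be computed from risk scores and outcomes. It is not the authors' evaluation code: the function name `auroc` and the toy `risk` and `died` values are hypothetical and chosen only for illustration. It uses the pairwise (Mann-Whitney) interpretation, in which AUROC is the fraction of (deceased, survivor) pairs where the deceased patient received the higher risk score, with ties counted as half.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk per patient (higher means higher risk)
    labels: 1 if the outcome occurred (e.g. deceased), 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
risk = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
died = [1, 1, 1, 0, 0, 0]
print(auroc(risk, died))
```

The quadratic loop is fine for a toy example; for a dataset of the size reported above, a rank-based implementation or an established library routine would be the more practical choice.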